Abstract

Conventional quantum mechanics with a complex Hilbert space and the Born Rule is derived from five 
axioms describing properties of probability distributions for the outcome of measurements. Axioms I, II, III 
are common to quantum mechanics and hidden variable theories. Axiom IV recognizes a phenomenon, first 
noted by Turing and von Neumann, in which the increase in entropy resulting from a measurement is reduced 
by a suitable intermediate measurement. This is shown to be impossible for local hidden variable theories. 
Axiom IV, together with the first three, almost suffices to deduce the conventional rules but allows some
exotic alternatives such as real or quaternionic quantum mechanics. Axiom V recognizes a property of the
distribution of outcomes of random measurements on qubits which holds only in the complex Hilbert space 
model. It is then shown that the five axioms also imply the conventional rules for all dimensions. 


1 Introduction 

Because the conventional rules of quantum mechanics (CRQM) are abstract and unintuitive, there have been 
many attempts since their formulation by Dirac and von Neumann to derive them from axioms extracted 
from the experimental data. The axioms are supposed to isolate the essential ingredients of the data that 
require the CRQM. Specifically they should answer the following questions: (1) Why must we identify the 
set S of pure states with a Hilbert space and in particular a Hilbert space over the complex field (C) rather 
than the real field (R) or the quaternions (H)? (2) Why must we calculate probabilities by the Born Rule? 
(3) What precisely is it that excludes a hidden-variable explanation? 

We obtain an important clue to the construction of a satisfactory axiomatic system that addresses these 
questions when we recall that the celebrated quantum logic approach of Birkhoff and von Neumann [1] gives
only a very limited restriction on the model. It endows the system with a projective space structure and
hence gives meaning to "dimension" but imposes no restriction at all for dimension N = 2 (e.g. qubits or spin
1/2) and almost none for N = 3 (e.g. spin 1). Even for N > 3, it only tells us that S is a projective space over
some sort of skew field F, which is insufficient to derive the Born Rule by means of Gleason's Theorem [2].
Even if F is assumed to be C, Gleason's theorem only works for N > 2.

The fact that the CRQM for qubits is the most difficult to derive suggests an axiomatic strategy based on 
information theory in which N = 2 subspaces play the fundamental role. Reinforcement for this strategy
comes from the recognition [3-5] that, even for a system prepared in a pure state, there are different probability
distributions for the outcomes of different measurements. When this is taken into account, it was shown in [5] 
that the appropriate measure of the removal of uncertainty when the outcome of a measurement is known 
is not the von Neumann entropy, but rather the so-called information entropy computed by averaging the 
Shannon entropy over the outcomes of all possible measurements. Unlike the von Neumann entropy, the 
information entropy is model dependent and, as the following table shows, can distinguish between real,
complex, and quaternionic quantum mechanics in N = 2 subspaces.

Pure state entropy for N = 2:

                             R              C       H
    von Neumann entropy      0              0       0
    information entropy      2 ln 2 - 1     1/2     7/12        (1)

The calculations can be found in Section 7 below.

We shall exploit these observations to construct a derivation of the CRQM. 

2 Preliminary observations 

The data consists of a table of the function 

    p(x,y) = x(y) \in [0,1]    (2)

which gives the attenuation of a beam of particles prepared by a device labeled x after passage through a 
detecting device labeled y. In optics, for example, x labels the polarizers and y the analyzers. Since the
same device may play either role, we use the same set S of labels for each argument. We refer to devices
labeled by the elements of S as filters [8]. We refer to a table giving the values of x(y) for S as its p-table.

We confine our discussion to systems which the CRQM describes by a Hilbert space of finite dimension N. 
With this restriction the CRQM may be stated as follows: 

The CRQM: There is a map z \to \mathbf{z} from S to a complex projective space CP^{N-1} for some finite N such
that

    x(y) = |\hat{\mathbf{x}}^* \cdot \hat{\mathbf{y}}|^2    (3)

where

    \hat{\mathbf{z}} = \mathbf{z}/|\mathbf{z}|.

We refer to this map as the Born Rule correspondence (BRC) between S and CP^{N-1}.

Our axioms will be a minimal set of properties of a p-table from which we can deduce the CRQM as stated. 
Thus our axiomatization is based on the empirical data, in contrast to that of Hardy [9], which is based on
what he describes as "reasonable axioms which might well have been posited without any particular access to 
the empirical data". 

One immediately observes two kinds of structure in the p-table of a quantum mechanical system: a metrical 
structure and a statistical structure. What is special about the p-table of a quantum mechanical system is 
the way these two structures interact. 

It will be helpful to have before us the p-table of a very simple system S_0 with N = 2 which exhibits the
properties we will examine in this section. This table is a skeleton of quantum mechanics in which there are
just four filters x, x', y, y' corresponding to states of linearly polarized light which the CRQM describes by

    \mathbf{x} = (1,0),  \mathbf{x}' = (0,1),  \mathbf{y} = 2^{-1/2}(1,1),  \mathbf{y}' = 2^{-1/2}(1,-1).    (4)

The p-table for S_0 is:





             x      x'     y      y'
      x      1      0      1/2    1/2
      x'     0      1      1/2    1/2
      y      1/2    1/2    1      0
      y'     1/2    1/2    0      1          (5)



The metrical structure is seen in the resemblance between the p-table and a road atlas which gives the 
distances between a set of cities. Our problem is analogous to that of determining the geometry of the earth 
from such an atlas. Like a road atlas, which has the special value zero on and only on the diagonal, the 
p-table has the special value unity on and only on the diagonal. Like an atlas it is also symmetric about
the diagonal. These properties of S_0 are shared by the p-table of any quantum mechanical system and are
expressed by our first axiom:

Axiom I: For every w, z \in S:

    w(z) = 1 if and only if w = z,
    w(z) = z(w).    (6)
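As an illustration, the p-table of the skeletal system can be recomputed from the state vectors of equation (4) and the Born Rule (3), and the two properties stated in Axiom I can be checked directly. The following is a minimal sketch in Python; the names `states`, `born` and `p_table` are ours and are not part of the formalism.

```python
import math

# State vectors from equation (4); the dictionary name `states` is ours.
s = 1 / math.sqrt(2)
states = {
    "x": (1.0, 0.0),
    "x'": (0.0, 1.0),
    "y": (s, s),
    "y'": (s, -s),
}

def born(u, v):
    """Born Rule (3) for normalized real 2-vectors: |u . v|^2."""
    return (u[0] * v[0] + u[1] * v[1]) ** 2

labels = list(states)
p_table = {(a, b): born(states[a], states[b]) for a in labels for b in labels}

# Axiom I: unity on and only on the diagonal, and symmetry about the diagonal.
for a in labels:
    for b in labels:
        assert (abs(p_table[(a, b)] - 1.0) < 1e-12) == (a == b)
        assert abs(p_table[(a, b)] - p_table[(b, a)]) < 1e-12

# Print the table row by row, rounded to two decimals.
for a in labels:
    print(a.ljust(3), ["%.2f" % p_table[(a, b)] for b in labels])
```

The printed rows can be compared entry by entry with the p-table of the four filters x, x', y, y'.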



The conditional probability that a particle in a beam prepared by a u filter will pass a w filter given that
it has first passed a v filter is u(v)v(w). In a classical system the result would be the same if v and w were
interchanged. It follows from (6) that this is the case for all u if and only if either v(w) = 1 or v(w) = 0. In
the former case the two elements are identical.

Definition 2.1. Two elements u, v such that u(v) = 0 are said to be orthogonal and we write u \perp v. If u, v
are either identical or orthogonal they are said to be "compatible" or "classically related". If they are neither
identical nor orthogonal we write u \varphi v.



In S_0 observe that x \varphi y, x \varphi y', x' \varphi y, and x' \varphi y', so that S_0 cannot be decomposed into the union of two
subsets in which the elements of one are classically related to all members of the other.



Definition 2.2. A frame in a subset S* of S is a maximal set of mutually orthogonal elements. 

It follows from (6) that if x(z) = y(z) for all z, then x(y) = 1, and hence x = y. Thus there is a one-one
correspondence between the elements x and the functions x( ), and there is a natural metric on S:

    d(x,y) = \sup_{z \in S} |x(z) - y(z)|.    (7)

Proposition 2.1. The functions u( ) are continuous with respect to the topology defined by the d-metric.



Proof.

    |u(v) - u(w)| = |v(u) - w(u)| \le \sup_{u \in S} |v(u) - w(u)| = d(v,w).    (8)
□



Note: Since there is a one-one correspondence u \leftrightarrow u( ), we shall often use u to mean u( ) when no confusion
will result. 

Observe that d(u, v) takes values between zero and unity, the former if and only if u and v are identical and
the latter if and only if u and v are orthogonal. Thus the elements of a frame are a maximal set of maximally 
separated elements in the d-metric. 

The following axiom ensures that the number of elements in any frame is finite and that S is complete.

Axiom II: S is compact in the d-metric.



Proposition 2.2. The number of elements in any frame is finite. 



Proof. A compact metric space is closed and totally bounded [12]. The latter implies that at most a finite
number of elements can be separated by any constant amount. □ 

Note: It is this axiom that will restrict us to systems represented in the CRQM by finite dimensional Hilbert 
spaces. The closure property will play a role later in the discussion. 

A frame consisting of the elements x = {x_1, x_2, ...} will be denoted F_x. In S_0 there are just two frames
F_x = {x, x'} and F_y = {y, y'}.

Definition 2.3. A frame function on S* is a probability distribution function on every frame in S*, i.e. a
function \xi from S* to [0, 1] such that

    \sum_j \xi(x_j) = 1 for every frame F_x.    (9)

Note: We use the term "frame function" for what Gleason [2] calls a non-negative frame function of weight
unity. 

Axiom III: The function a(x) is a frame function on S for every a \in S.

We refer to frame functions formed in this way from the elements of S as pure states. 

The elements of a frame are the possible outcomes of some measurement. The physical interpretation of 
Axiom III is that a(x_j) is the probability of the outcome x_j when the measurement is performed on a system
in the state a. In S_0 there are four pure states x, x', y, y'.

In view of the symmetry of p, Axiom III also says that for every frame F_x in S we have

    \sum_j x_j(a) = 1 for every element a \in S.    (10)

A set of functions with this property is said to be a basis of S, whence Axiom III is equivalent to the assertion 
that every frame in S is a basis of S. 

We are led to the notion of "subspace" by considering subsets S* of S for which a given (not necessarily 
maximal) set of mutually orthogonal elements is a basis. For elements in this subset, the only possible 
outcomes of a specific measurement will be one of the finite set of elements of the basis. 

Definition 2.4. Let F_x^* = {x_1, ..., x_N} be a set of mutually orthogonal elements of S. The subspace S* of
S spanned by F_x^* consists of elements z for which F_x^* is a basis, i.e. for which

    \sum_{j=1}^{N} x_j(z) = 1.    (11)

Evidently {x_1, ..., x_N} is a maximal orthogonal set in S* since no element can be orthogonal to all of them
and satisfy (11). Hence F_x^* = {x_1, ..., x_N} is a frame in S*.

Proposition 2.3. An element orthogonal to every element of a frame in S* is orthogonal to every element
of S*.

Proof. If y is orthogonal to a frame F_x^* = {x_1, ..., x_N} on S*, then {y, x_1, ..., x_N} is a subset of some
maximal set of mutually orthogonal elements of S. Hence by (10) \sum_{j=1}^{N} x_j(z) + y(z) \le 1. But the first term
is unity for z \in S*, whence y(z) = 0. □

Proposition 2.4. Every frame in S* has the same number of elements and spans S*. Moreover, if
{x_1, x_2, ..., x_N} and {y_1, y_2, ..., y_N} are two such frames, then

    \sum_{j=1}^{N} y_j(w) = \sum_{j=1}^{N} x_j(w) for all w \in S    (12)

with common value unity if and only if w \in S*.

Proof. Extend F_x^* to a frame F_x = {x_1, x_2, ..., x_N, x_{N+1}, ...} in S. By Proposition 2.3 the elements
{x_{N+1}, ...} are orthogonal to every element of S* and hence to any maximal set {y_1, y_2, ...} of mutually
orthogonal elements of S*. If an element z is orthogonal to x_{N+1}, ..., then \sum_{j=1}^{N} x_j(z) = 1, so z \in S* and
hence cannot be orthogonal to {y_1, y_2, ...} since this is a maximal set of mutually orthogonal elements of S*.
Hence {y_1, y_2, ..., x_{N+1}, ...} is a maximal set of mutually orthogonal elements of S, so that for all w \in S
we have

    \sum_j y_j(w) + \sum_{j=N+1}^{\infty} x_j(w) = 1.    (13)

Since \sum_{j=1}^{N} x_j(w) + \sum_{j=N+1}^{\infty} x_j(w) = 1 it follows that \sum_j y_j(w) = \sum_{j=1}^{N} x_j(w) with common value unity
if and only if w \in S*. Thus F_y^* = {y_1, y_2, ...} spans S*, and it remains to prove that this set contains N
elements. Let {x_1, ..., x_N} and {y_1, ..., y_M} be frames on S* so that for any z, w \in S*

    \sum_{i=1}^{N} x_i(z) = 1 = \sum_{j=1}^{M} y_j(w).    (14)

Put z = y_j on the left side and sum on j to obtain M, and put w = x_i on the right side and sum on i to
obtain N. The two double sums are equal by the symmetry (6). □

Definition 2.5. The number N is referred to as the dimension of the subspace S* . 

Corollary 2.5. For every element x of a two-dimensional subspace, there is a unique element x' to which it 
is orthogonal. 

Proof. If x \perp x' and x \perp x'', then x(x'') + x'(x'') = 1 \Rightarrow x'(x'') = 1, i.e. x' = x'' by (6). □

We refer to x and x' as antipodes and indicate the two dimensional space that they span by V_{xx'}.

3 Effect of measurements on states. 

The frame function property of the elements a \in S means that the probabilities b_j(a), for a particle in a
beam prepared by an a-filter to pass one of the filters b_1, b_2, ... of a frame F_b, sum to unity. The fact that
b(b) = 1 for any b means that a beam which is unattenuated by a b filter will be unattenuated by a second b
filter. Thus a determination of which filter of frame F_b a system in state a passes transforms the probability
distribution function a into the probability distribution function F_b a given by:

    a \to F_b a = \sum_j b_j(a) b_j.    (15)

We shall refer to this transformation as a measurement of a by F_b. It is the transformation referred to in
quantum mechanics as "collapse of the wave function".

In view of Axiom III, F_b a is a frame function, since if \rho_1, \rho_2, ... are frame functions so also is any convex
linear combination of them, i.e. any function \rho = \sum_j \alpha_j \rho_j with 0 \le \alpha_j \le 1 and \sum_j \alpha_j = 1. Frame functions
such as F_b a that result from measurements are of a special type in that they are composed of orthogonal
pure states. We refer to these as mixed states. The term "state" will be used to refer to either a pure state 
or a mixed state frame function. 

Note: It is important to keep in mind that we cannot assume at this stage that frame functions composed of 
convex combinations of non-orthogonal pure states can be written as convex combinations of orthogonal pure
states, something we know to be a consequence in the CRQM of the diagonalizability of convex combinations
of projection operators representing pure states. 

The mixed state \rho = \sum_j \alpha_j b_j has the property that it is unchanged by a measurement of F_b, i.e.

    F_b \rho = F_b \sum_j \alpha_j b_j = \sum_j \alpha_j F_b b_j = \sum_j \alpha_j b_j = \rho    (16)

but will change for other choices of the frame.

The Shannon entropy of a probability distribution (q_1, q_2, ...)

    S[(q_1, q_2, ...)] = -\sum_j q_j \ln q_j    (17)

measures the removal of uncertainty when an outcome is known. As noted above, Stotland et al. [5] observed
that if we do not know in advance what measurement is going to be made on a state \rho, then to compute
the removal of uncertainty when a measurement is made on it we should take the average of the Shannon
entropy of the outcome distributions from all possible measurements. This average is called the "information
entropy" S[\rho]. In contrast, the von Neumann entropy H[\rho] considers only the distribution of outcomes when
the measurement does not change the state, i.e.

    H[\sum_j \alpha_j b_j] = -\sum_j \alpha_j \ln \alpha_j.    (18)

Both the von Neumann entropy and the information entropy provide a measure of the purity of a mixed 
state. Let us compare them for states belonging to the skeletal system S_0 given by the p-table (5).

In S_0 there are just two frames F_x and F_y. We have F_x x = x and F_y y = y. There appear to be two
mixed states F_x y = \frac{1}{2}x + \frac{1}{2}x' and F_y x = \frac{1}{2}y + \frac{1}{2}y', but these are in fact identical. To see this observe that
\frac{1}{2}x(z) + \frac{1}{2}x'(z) = \frac{1}{2}(z(x) + z(x')) = \frac{1}{2} for every z and the same with y replacing x. The von Neumann entropy
of the pure states is zero and of the mixed state is \ln 2. The information entropy is \frac{1}{2}\ln 2 for the pure states
and \ln 2 for the mixed state.

In S_0 we see a rudimentary form of interference, i.e. it can happen that two states formed in different ways
(in this case the states F_x y and F_y x) turn out to be identical. More generally it follows from (12) that if F_x^*
and F_y^* are frames in the same N dimensional subspace, then the maximally mixed states in that subspace
are identical, i.e.

    \frac{1}{N}\sum_{j=1}^{N} x_j = \frac{1}{N}\sum_{j=1}^{N} y_j.    (19)

Although S_0 exhibits this indistinguishability of maximally mixed states, it is not a quantum mechanical
system. This may be seen by observing that its statistical behavior can be produced by the following hidden
variable model: Let \xi be a hidden variable that takes on values on a circle. When \xi is on the right and left
halves of the circle, a particle passes filters x, x' respectively, and when \xi is on the upper and lower halves of
the circle, it passes y, y' respectively. The conditional probability that \xi be on the right half, given that it is
in the upper half is 1/2. Similarly all of the values in the table are predicted correctly.
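The hidden variable model just described can be checked by an exact count. The sketch below assumes, as the text implies but does not state, that the hidden variable is uniformly distributed on the circle, so that only the four quadrants matter; the quadrant indexing and the names `passes` and `p` are ours.

```python
from fractions import Fraction

# Split the circle into four equal quadrants, indexed counter-clockwise
# starting from the quadrant between 0 and 90 degrees (assumed uniform measure).
# Each filter passes the particle when the hidden variable lies in its half.
passes = {
    "x":  {0, 3},   # right half
    "x'": {1, 2},   # left half
    "y":  {0, 1},   # upper half
    "y'": {2, 3},   # lower half
}

def p(a, b):
    """Conditional probability of passing b, given the variable lies in a's half."""
    return Fraction(len(passes[a] & passes[b]), len(passes[a]))

labels = ["x", "x'", "y", "y'"]
table = {(a, b): p(a, b) for a in labels for b in labels}

# Print the resulting table of conditional probabilities.
for a in labels:
    print(a.ljust(3), [str(table[(a, b)]) for b in labels])
```

Every entry is 1, 0 or 1/2, in the same positions as in the p-table of the four filters.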

The S_0 model can be generalized to one with sets of 2(R + 1) elements x_1, x_2, ..., x_{R+1} and x'_1, x'_2, ..., x'_{R+1}
such that x_j(x_k) = \frac{1}{2} and x_j(x'_k) = \frac{1}{2} for j \ne k. Just as S_0, which has R = 1, is the skeleton of two
dimensional quantum mechanics over the real number field, the case R = 2 is a skeleton of two-dimensional
complex quantum mechanics with x_1, x_2, x_3 corresponding to (1, 0), 2^{-1/2}(1, 1), 2^{-1/2}(1, i) respectively and
x'_1, x'_2, x'_3 corresponding to (0, 1), 2^{-1/2}(1, -1), 2^{-1/2}(1, -i) respectively. The case R = 4 is a skeleton of
quaternionic quantum mechanics with additional elements x_4, x_5 and x'_4, x'_5 corresponding to 2^{-1/2}(1, \pm j)
and 2^{-1/2}(1, \pm k) where 1, i, j, k are the quaternionic units. None of these skeletons are quantum mechanical
since, as for S_0, the function x(y) can be reproduced by a hidden-variable \xi ranging over an R-sphere: If the
cartesian coordinates of \xi are (\xi_1, ..., \xi_{R+1}), a particle passes an x_j or x'_j filter according as \xi_j is positive or
negative.



4 The entropic Turing-von Neumann effect 

We see then that there is a crucial ingredient of quantum mechanics not accounted for in S_0 or any of these
generalizations. A clue to the missing ingredient comes from a comparison of the information entropy of a 
pure state in the three skeletal models with that of real, complex, and quaternionic quantum mechanics given 
in the following table: 

Information entropy for N = 2 pure states

                       R              C              H
    skeleton           (1/2) ln 2     (2/3) ln 2     (4/5) ln 2
    qm                 2 ln 2 - 1     1/2            7/12
    qm - skeleton      0.040          0.037          0.029          (20)
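The entries of table (20) can be estimated numerically. The sketch below rests on two assumptions that are ours: that a skeleton with R + 1 frames weights each frame equally, and that in the quantum mechanical case the measurement directions are uniformly distributed on an R-sphere (R = 1, 2, 4 for the real, complex and quaternionic cases) with outcome probability cos^2(theta/2). The function names are hypothetical.

```python
import math

def binary_entropy(p):
    """Shannon entropy (in nats) of the two-outcome distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def skeleton_entropy(R):
    """Average over the R + 1 frames of a skeleton: one frame yields (1, 0),
    the other R frames yield (1/2, 1/2)."""
    return R * math.log(2) / (R + 1)

def sphere_entropy(R, n=200000):
    """Average of the binary entropy over directions uniform on an R-sphere,
    with p = cos^2(theta / 2); midpoint rule in the polar angle theta."""
    num = den = 0.0
    for k in range(n):
        theta = (k + 0.5) * math.pi / n
        w = math.sin(theta) ** (R - 1)   # surface measure on the R-sphere
        num += w * binary_entropy(math.cos(theta / 2) ** 2)
        den += w
    return num / den

for name, R in (("R", 1), ("C", 2), ("H", 4)):
    skel, qm = skeleton_entropy(R), sphere_entropy(R)
    print(name, round(skel, 4), round(qm, 4), round(qm - skel, 4))
```

Under these assumptions the sphere averages agree with the closed forms 2 ln 2 - 1, 1/2 and 7/12 to the accuracy of the quadrature.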
We see that the quantum mechanical values are slightly larger than those of the skeletons, which tells us
that there must be more possible measurement frames to account for the difference. The effect of a measurement
described by the frames F_x and F_y in S_0 is either to leave a pure state alone or transform it into the
maximally mixed state. Thus the von Neumann entropy change is either zero or \ln 2 in any measurement. In
quantum mechanical systems, however, we observe that the increase in von Neumann entropy resulting from
a measurement can be reduced by making a suitable intermediate measurement. This phenomenon, which
was first noted by Turing [10] and elaborated by von Neumann [11], is related to the so-called Zeno effect in
which the dynamic evolution of a state is arrested by continuous monitoring.

In order for the increase in von Neumann entropy resulting from the measurement of F_b on a to be reduced
by some intermediate measurement, there must be a frame F_c such that

    H(F_b F_c a) < H(F_b a).    (21)

For F_c to accomplish this, it must transform a into a state which in some sense lies "between" a and b. A
natural candidate for such a state is one of the form \rho = \alpha a + (1 - \alpha)b with 0 < \alpha < 1. Although this is not a
combination of orthogonal pure states, we have seen above that it is possible for two frame functions formed
in different ways to be identical, and hence some measurement of F_c on a can in principle produce a frame
function identical to \rho. The resulting entropy will be smallest if the combination involves just a single pair
of orthogonal states c, d. The following axiom asserts that such a measurement does indeed exist, and, as 
we shall see, this will produce the entropic Turing-von Neumann effect. It is the key property distinguishing 
quantum mechanical from classical systems. 

Axiom IV: Given any two distinct elements a, b \in S there exists a pair of orthogonal elements c, d and a
number 0 < \alpha < 1 such that

    a(c)c + a(d)d = \alpha a + (1 - \alpha)b.    (22)

The following are consequences of Axiom IV and Axiom I:

Proposition 4.1.

    a(c) + a(d) = 1.    (23)

    b(c) = a(c),  b(d) = a(d).    (24)

    \alpha = \frac{1}{2}.    (25)

    a(c) = \frac{1}{2}(1 \pm \sqrt{a(b)}).    (26)



Proof. The left side of (22) is a frame function since the right side is. Hence equation (23) holds. Substitution
of z = c in the arguments of the functions c, d, a, b in (22) gives

    a(c) = \alpha a(c) + (1 - \alpha)b(c),    (27)

whence a(c) = b(c) since \alpha \ne 1. Similarly substitution of z = d gives a(d) = b(d), proving (24). Substituting
z = a and z = b in the arguments and using (23), (24) we have

    a(c)^2 + (1 - a(c))^2 = \alpha + (1 - \alpha)b(a),
    a(c)^2 + (1 - a(c))^2 = \alpha a(b) + (1 - \alpha),    (28)

whence, since a(b) \ne 1 for a \ne b, we obtain (25) and (26). □



Since the interchange of c and d changes a(c) to a(d) = 1 - a(c) we can choose c and d such that a(c) \ge \frac{1}{2},
i.e. assume the solution with the + in (26).

Hence we have:

Equivalent form for Axiom IV

Given a \ne b there exist orthogonal states c, d and a number \lambda such that

    \frac{1}{2}(a + b) = \lambda c + (1 - \lambda)d  with  \frac{1}{2} \le \lambda < 1.    (29)
It will then follow from the substitutions used above to obtain (23)-(26) that

    \lambda = a(c) = b(c) = \frac{1}{2}(1 + \sqrt{a(b)}).    (30)



Note that (29) is a second example of an interference phenomenon in which two frame functions formed
in different ways turn out to be identical. What it says is that every mixture of two orthogonal states 
is indistinguishable from some equal mixture of two states. (Its validity in the CRQM follows from the 
diagonalizability of density matrices.) 
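The validity of (29) and (30) for a qubit can be illustrated numerically. The sketch below assumes the standard Bloch-sphere description of the CRQM for N = 2, in which a pure state is a unit 3-vector and u(v) = (1 + U.V)/2; the particular vectors and the names `unit` and `prob` are ours and serve only as an example.

```python
import math

def unit(v):
    """Normalize a 3-vector."""
    n = math.sqrt(sum(t * t for t in v))
    return tuple(t / n for t in v)

def prob(u, v):
    """CRQM qubit probability u(v) = (1 + U.V)/2 for Bloch vectors U, V."""
    return 0.5 * (1.0 + sum(s * t for s, t in zip(u, v)))

# Two arbitrary distinct, non-orthogonal pure states (illustrative values).
A = unit((0.3, -0.5, 0.8))
B = unit((0.9, 0.2, 0.1))

C = unit(tuple(s + t for s, t in zip(A, B)))   # candidate c: midpoint of the arc AB
D = tuple(-t for t in C)                        # candidate d: antipode of c
lam = 0.5 * (1.0 + math.sqrt(prob(A, B)))       # equation (30)

# Check (29) as an identity between functions of z, on a few test states.
for Z in (unit((1, 0, 0)), unit((0, 1, 0)), unit((1, 2, -3)), A, B, C):
    lhs = 0.5 * (prob(A, Z) + prob(B, Z))
    rhs = lam * prob(C, Z) + (1.0 - lam) * prob(D, Z)
    assert abs(lhs - rhs) < 1e-12

print("lambda =", lam, " a(c) =", prob(A, C), " b(c) =", prob(B, C))
```

In this representation the orthogonal pair c, d of (29) is the midpoint of the arc joining a and b together with its antipode.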



We shall often use the special form taken by (29) when the argument of the functions a, b, c, c' is a member
of the subspace V_{cc'} spanned by c, d so that d(z) = 1 - c(z), namely:

    \frac{1}{2}(a(z) + b(z)) = (2\lambda - 1)c(z) + (1 - \lambda),  z \in V_{cc'}.    (31)

From (23), (24) this applies to z = a, b and of course to z = c, d.



We next derive a number of propositions that follow from Axioms I-IV that will play a role in the subsequent 
analysis: 



Proposition 4.2. If a and b are not orthogonal the solution c, d of (29) is unique.

Proof. If a, b are not orthogonal, then \lambda \ne \frac{1}{2}. If g, g' are also solutions of (29) it follows from (31) with z = g
that (2\lambda - 1)c(g) = (2\lambda - 1)g(g) from which c(g) = 1 and hence g = c. □

Note: When we wish to emphasize that c, d and \lambda in (29) are uniquely determined by a non-orthogonal pair
a, b, we will write c = c(a, b), d = d(a, b) and \lambda = \lambda(a, b).

Proposition 4.3. A pair of distinct states a, b belong to a unique two dimensional subspace V_{ab}.



Proof. It follows from (23), (24) that a, b belong to V_{cc'}. Suppose a and b belong to another two dimensional
subspace V_{gg'}. Add the two equations obtained from (29) by setting z = g and z = g'. Since a, b \in V_{gg'}, the
left sides will sum to unity, and we obtain

    1 = \lambda(c(g) + d(g)) + (1 - \lambda)(c(g') + d(g')).    (32)

Since there is some maximal set of mutually orthogonal elements containing c, d, the coefficients of \lambda and
(1 - \lambda) lie between zero and unity as does \lambda. Hence the only solution is for both coefficients to be unity.
Hence g, g' must lie in the subspace V_{cc'}, and hence by Proposition 2.4 the subspaces V_{gg'} and V_{cc'} are
identical. □



Corollary 4.4. Two two-dimensional subspaces are either disjoint, identical, or have just one point in 
common. 

Proposition 4.5. If x is orthogonal to two distinct states a, b, it is orthogonal to every member of V_{ab}.

Proof. From (29) if x \perp a and x \perp b, then x \perp c and x \perp d. Hence by Proposition 2.3 x is orthogonal to the
subspace spanned by c and d, i.e. it is orthogonal to V_{ab}. □



Definition 4.1. Two elements x, y are said to be "mutually equatorial" if x(y) = \frac{1}{2}, and we write x - — y.
The set of elements equatorial to a given element z is referred to as the equator opposite z and denoted \mathcal{E}_z.

Proposition 4.6. If a \varphi b belong to a two-dimensional subspace V, and a, b \in \mathcal{E}_z for some z \in V, then
c(a, b) \in \mathcal{E}_z.

Proof. Solve (31) for c(z) with a(z) = b(z) = \frac{1}{2}. Noting that \lambda \ne \frac{1}{2} for a \varphi b, we obtain c(z) = \frac{1}{2}. □



Proposition 4.7. Let a and b be distinct and non-orthogonal, and let b' be the antipode of b in V_{ab}. Then

    c(a, b) and c(a, b') are mutually equatorial.    (33)

Proof. Since a(b) \ne 0 or 1 implies that a(b') \ne 0 or 1, it follows from (29) that there exists a unique orthogonal
pair e, e' such that

    \frac{1}{2}(a(z) + b'(z)) = \gamma e(z) + (1 - \gamma)e'(z),    (34)

where

    e = c(a, b'),  e' = c'(a, b'),  \gamma = \lambda(a, b') = \frac{1}{2}(1 + \sqrt{p(a, b')}).    (35)

Since a and b' belong to V_{ee'} as well as V_{ab}, it follows from Corollary 4.4 that e, e' belong to V_{ab}. Thus
both e, e' and b, b' span V_{ab}, whence from Proposition 2.4

    b(z) + b'(z) = e(z) + e'(z) for all z \in S.    (36)

Combining this with (34), we obtain

    a(z) - b(z) = (2\gamma - 1)(e(z) - e'(z)).    (37)

Since a(b') \ne 0, it follows that \gamma \ne \frac{1}{2}. Putting z = c(a, b) the left side is zero by (29), whence e(c) = e'(c).
But e(c) + e'(c) = 1 since c \in V_{ee'}, whence e(c) = \frac{1}{2}. □



Since Axiom IV was motivated by the entropic Turing-von Neumann effect we must verify that the effect does indeed
follow from the axiom, i.e. we must show that there is an intermediate measurement that reduces the increase of von
Neumann entropy caused by a measurement:

Lemma 4.7.1. Let \rho = \alpha a + (1 - \alpha)a' be a state in the two dimensional subspace V, and let \sigma be the
maximally mixed state. The von Neumann entropy H(\rho) increases monotonically as the distance d(\rho, \sigma)
decreases.

Proof. By (19) \sigma = \frac{1}{2}a + \frac{1}{2}a' so that

    d(\rho, \sigma) = \sup_z |\rho(z) - \sigma(z)| = |\alpha - \frac{1}{2}| \sup_z |a(z) - a'(z)| = |\alpha - \frac{1}{2}|.    (38)

The assertion follows from the fact that

    f(\mu) = -(\mu \ln \mu + (1 - \mu)\ln(1 - \mu))    (39)

is a monotonically decreasing function of |\mu - \frac{1}{2}| on the interval 0 < \mu < 1. □

Proposition 4.8. If a \ne b, there is a measurement F_x^* = {x, x'} with x \in V_{ab} such that

    H(F_b F_x a) < H(F_b a).    (40)



Proof. Let \sigma be the maximally mixed state in V_{ab}. Let c = c(a, b) and e = c(a, b') as defined in the
note following Proposition 4.2. If a(b) \ge \frac{1}{2}, choose x = c, and if a(b) < \frac{1}{2} choose x = e. We have:
F_b a = r b + (1 - r)b' with r = a(b), whence d(F_b a, \sigma) = |r - \frac{1}{2}| = |a(b) - \frac{1}{2}|. On the other hand F_b F_c a =
F_b(\frac{1}{2}(a + b)) = \mu b + (1 - \mu)b' with \mu = \frac{1}{2}(1 + a(b)), whence d(F_b F_c a, \sigma) = |\mu - \frac{1}{2}| = a(b)/2. If a(b) \ge \frac{1}{2},
then a(b)/2 > |a(b) - \frac{1}{2}| so that d(F_b F_c a, \sigma) > d(F_b a, \sigma), and hence H(F_b F_c a) < H(F_b a) by Lemma
4.7.1. If a(b) < \frac{1}{2}, the replacement of F_c by F_e replaces b with b', and since a(b') > \frac{1}{2}, the same conclusion is
obtained. □
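The inequality used in this proof is easy to tabulate. The sketch below takes the weights derived above, r = a(b) for the direct measurement and \mu = (1 + a(b))/2 for the measurement preceded by F_c, and compares the two entropies for a few illustrative values of a(b) \ge 1/2; the function names are ours.

```python
import math

def H(mu):
    """Entropy f(mu) of equation (39), in nats."""
    if mu <= 0.0 or mu >= 1.0:
        return 0.0
    return -(mu * math.log(mu) + (1.0 - mu) * math.log(1.0 - mu))

def direct(r):
    """H(F_b a): weights (r, 1 - r) on b, b' with r = a(b)."""
    return H(r)

def via_midpoint(r):
    """H(F_b F_c a): weights (mu, 1 - mu) with mu = (1 + a(b)) / 2."""
    return H(0.5 * (1.0 + r))

# Compare the two entropies for sample values of a(b) in [1/2, 1).
for r in (0.5, 0.6, 0.75, 0.9, 0.99):
    print(r, round(direct(r), 4), round(via_midpoint(r), 4))
    assert via_midpoint(r) < direct(r)
```

For a(b) < 1/2 the same comparison applies with b replaced by b', as in the last step of the proof.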



Although the metric d(x, y) as defined by (7) seems to require a knowledge of x(z) and y(z) for all z, we
shall see as a by-product of (37) that it is determined by x(y) alone, and the form of that dependence is
inconsistent with local hidden variables:

Proposition 4.9.

    d(a, b) = \sqrt{1 - a(b)} for all a, b \in S.    (41)

Proof. We can assume a(b) \ne 0 or 1, since otherwise the assertion follows from the definition (7) of d. By
Corollary 2.5 b has a unique antipode b' in V_{ab}. Take the supremum over z of the absolute value on both
sides of (37) to obtain:

    d(a, b) = |2\gamma - 1| d(e, e') = |2\gamma - 1| = \sqrt{a(b')} = \sqrt{1 - a(b)}.    (42)
□
Proposition 4.10. If the property (29) holds for S, its p-table cannot be reproduced by a local hidden variable
model unless it is classical, i.e. unless p only assumes the values zero and unity.

Proof. It is known [13] (see proof in the Appendix) that the relation between d(a, b) and a(b) in any hidden-
variable model differs from (41) in that the right side is 1 - a(b) rather than \sqrt{1 - a(b)}. Hence such models
are inconsistent with Axiom IV unless a(b) only assumes the values 0 and 1 for all elements, i.e. the system
is classical. □



Comment 4.1. Without the square root the triangle inequality for d becomes

    x(z) + z(y) \le 1 + x(y),    (43)

which is Bell's inequality. It is also shown in the Appendix that the CRQM gives (41). This strongly suggests
that property (29) has brought us close to the CRQM.
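Both statements can be illustrated for a qubit. The sketch below again assumes the Bloch-sphere form u(v) = (1 + U.V)/2 of the CRQM for N = 2 and restricts z to the great circle through the two states, which is where the supremum in (7) is attained in this representation; the angles chosen and the names `bloch`, `prob` and `d_numeric` are ours.

```python
import math

def bloch(theta):
    """Unit vector at polar angle theta on a great circle of the Bloch sphere."""
    return (math.sin(theta), 0.0, math.cos(theta))

def prob(u, v):
    """CRQM qubit probability u(v) = (1 + U.V)/2."""
    return 0.5 * (1.0 + sum(s * t for s, t in zip(u, v)))

def d_numeric(u, v, n=20000):
    """Approximate sup_z |u(z) - v(z)| with z sampled on the same great circle."""
    best = 0.0
    for k in range(n):
        z = bloch(2 * math.pi * k / n)
        best = max(best, abs(prob(u, z) - prob(v, z)))
    return best

# Three coplanar states separated by 60 degrees (illustrative choice).
x, z, y = bloch(0.0), bloch(math.pi / 3), bloch(2 * math.pi / 3)

# Equation (41): the metric is the square root of 1 - x(y).
assert abs(d_numeric(x, y) - math.sqrt(1.0 - prob(x, y))) < 1e-6

# Inequality (43) evaluated for these three states.
lhs = prob(x, z) + prob(z, y)
rhs = 1.0 + prob(x, y)
print(lhs, rhs, lhs <= rhs)
```

For this configuration the left side of (43) exceeds the right side, so the CRQM values violate the inequality that any hidden-variable model must satisfy.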



5 Structure of two dimensional subspaces 

We shall now exploit Axioms I-IV to show that two dimensional subspaces are isometric to spheres. It is 
important to distinguish the dimension N = 2 of the subspace from the dimension of the spheres which, as
we shall see, is determined by the "rank" of the subspace given by the following: 

Definition 5.1. The rank R of a two dimensional subspace is one less than the maximum number of elements 
in a mutually equatorial set. 

Proposition 5.1. If V is a two dimensional subspace of rank R, there is a one-one correspondence x \to X
between the elements x of V and the points X of a unit R-sphere such that

    x(y) = \cos^2(\frac{1}{2}XY)    (44)

where XY is the angle subtended at the center of the sphere by the arc joining X and Y.

Proof. We shall need several lemmas. Define \theta(x, y) by

    x(y) = \cos^2(\frac{1}{2}\theta(x, y)).    (45)

Definition 5.2. A set of elements x_1, x_2, ... is said to be properly mapped to the points X_1, X_2, ... of
an R-sphere if \theta(x_i, x_j) = X_i X_j for all i, j.

To prove Proposition 5.1 we shall prove that there is a one-one proper mapping of the elements of V to the
points of an R-sphere.



Because we are restricting the elements to a two dimensional subspace, we can use the special form (31) of
Axiom IV. Rewriting this in terms of \theta defined by (45), it becomes

    \cos\theta(a, z) + \cos\theta(b, z) = 2\cos(\frac{1}{2}\theta(a, b)) \cos\theta(c, z),    (46)

and

    \theta(a, c) = \theta(c, b) = \frac{1}{2}\theta(a, b).    (47)

The geometric significance of (46) and (47) is revealed by the following lemma which applies to an R-sphere
for any R.

Lemma 5.1.1. Let A, B, C, Z be points on an R-sphere, with C the midpoint of the arc joining A and B.
Then

    \cos(AZ) + \cos(BZ) = 2\cos(\frac{1}{2}AB) \cos(CZ).    (48)

Proof. If A, B are antipodes, the right and left sides both vanish. Assume they are not antipodes. Let
\mathbf{x}_A, \mathbf{x}_B, \mathbf{x}_Z be unit vectors from the center to the points A, B, Z respectively of a unit R-sphere. The vector
\mathbf{x} = \frac{1}{2}(\mathbf{x}_A + \mathbf{x}_B) connects the center to the midpoint of the chord joining A, B. Its length is

    |\mathbf{x}| = \sqrt{(1 + \mathbf{x}_A \cdot \mathbf{x}_B)/2} = \cos(\frac{1}{2}AB)

which does not vanish if A and B are not antipodes. Hence \mathbf{x}_C = (\mathbf{x}_A + \mathbf{x}_B)/(2\cos(\frac{1}{2}AB)) is a unit vector
from the center to the midpoint C of the great circle arc joining A and B. Hence if CZ is the arc joining C
and Z we have

    \cos(CZ) = \mathbf{x}_C \cdot \mathbf{x}_Z = (\mathbf{x}_A \cdot \mathbf{x}_Z + \mathbf{x}_B \cdot \mathbf{x}_Z)/(2\cos(\frac{1}{2}AB))
             = (\cos(AZ) + \cos(BZ))/(2\cos(\frac{1}{2}AB))

whence (48) follows. □
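The construction in this proof can be checked numerically on random points. The sketch below draws unit vectors in R + 1 dimensions, builds the midpoint vector as in the proof, and records the largest deviation in three checks: that the midpoint vector has unit length, that it makes the angle AB/2 with A, and that (48) holds. The sampling scheme, the skip threshold for nearly antipodal pairs and all names are ours.

```python
import math
import random

def random_unit(dim, rng):
    """A random unit vector in `dim` dimensions (normalized Gaussian sample)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    n = math.sqrt(sum(t * t for t in v))
    return [t / n for t in v]

def dot(u, v):
    return sum(s * t for s, t in zip(u, v))

rng = random.Random(0)
max_err = 0.0
for R in (1, 2, 4):                          # an R-sphere sits in R + 1 dimensions
    for _ in range(200):
        xa, xb, xz = (random_unit(R + 1, rng) for _ in range(3))
        t = dot(xa, xb)                      # cos(AB)
        if 1.0 + t < 1e-6:                   # skip nearly antipodal pairs
            continue
        c_half = math.sqrt((1.0 + t) / 2.0)  # cos(AB/2), as in the proof
        xc = [(s + u) / (2.0 * c_half) for s, u in zip(xa, xb)]
        lhs = dot(xa, xz) + dot(xb, xz)      # cos(AZ) + cos(BZ)
        rhs = 2.0 * c_half * dot(xc, xz)     # 2 cos(AB/2) cos(CZ)
        max_err = max(max_err,
                      abs(math.sqrt(dot(xc, xc)) - 1.0),  # x_C is a unit vector
                      abs(dot(xa, xc) - c_half),          # angle AC equals AB/2
                      abs(lhs - rhs))                     # identity (48)
print(max_err)
```

The deviation stays at the level of floating-point rounding for all three values of R tried here.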



Lemma 5.1.2. Suppose that a subset of the two-dimensional subspace V consisting of z_1, z_2, ... is properly
mapped to the set of points Z_1, Z_2, ... on the R-sphere S_R. Then the subset consisting of z_1, z_2, ..., z'_1, z'_2, ...,
where z'_j is the antipode of z_j on V, is properly mapped to Z_1, Z_2, ..., Z'_1, Z'_2, ..., where Z'_j and Z_j are
antipodes on S_R.

Proof. For any elements x, y of the set z_1, z_2, ..., z'_1, z'_2, ..., we have

    x'(y) = 1 - x(y) = \sin^2(\theta(x, y)/2) = \cos^2((\theta(x, y) + \pi)/2)
          = \cos^2((XY + \pi)/2) = \cos^2(X'Y/2)

whence

    \theta(x', y) = X'Y.
□

We refer to the result of this lemma as antipode adjunction to a properly mapped subset.



Lemma 5.1.3. Suppose that a subset of the two-dimensional subspace $V$ consisting of $a \not\perp b$ and $z_1, z_2, \cdots$ can be properly mapped to the points $\{A, B, Z_1, Z_2, \cdots\}$ on $S_R$. Let $C$ be the midpoint of the shorter great circle arc joining $A, B$, and let $c = c(a,b)$ of (29). Then the mapping $c \to C$ extends the proper mapping to the set $\{a, b, c, z_1, z_2, \cdots\}$.

Proof. Let $z$ be any of the $z_j$'s. If $\{a, b, z\} \to \{A, B, Z\}$ is a proper mapping, it follows from (31) that

$$\cos(AZ) + \cos(BZ) = 2\cos(\tfrac{1}{2}AB)\cos(\theta(c,z)). \tag{49}$$

If $a \not\perp b$, then $AB \ne \pi$ so that $\cos(\tfrac{1}{2}AB) \ne 0$, whence from (31) and Lemma 5.1.1 we have

$$\theta(c,z) = CZ, \tag{50}$$

and from (47),

$$\theta(a,c) = \tfrac{1}{2}\theta(a,b) = \tfrac{1}{2}AB = AC, \qquad \theta(c,b) = \tfrac{1}{2}\theta(a,b) = \tfrac{1}{2}AB = CB. \tag{51}$$

Hence $c \to C$ extends the proper mapping $\{a, b, z\} \to \{A, B, Z\}$ to $\{c, a, b, z\} \to \{C, A, B, Z\}$. □

We call the result of this lemma midpoint adjunction to a properly mapped subset.



Lemma 5.1.4. Suppose that a subset $\mathcal{K}$ of the two-dimensional subspace $V$ can be properly mapped to a unit $R$-sphere $S_R$. If $a \not\perp b \in \mathcal{K}$, there exists a subset $\mathcal{C}_{ab}$ of $V$ containing $a$ and $b$, referred to as a circular subset, which can be properly mapped to a great circle $C_{AB}$ on $S_R$ in such a way that the union of $\mathcal{C}_{ab}$ and $\mathcal{K}$ is properly mapped to $S_R$. If $a$ and $b$ belong to the equator $\mathcal{E}(z)$ of some state $z \in V$, then every element $y \in \mathcal{C}_{ab}$ also belongs to $\mathcal{E}(z)$.

Proof. Let $V$ be a two-dimensional subspace with elements $a \not\perp b$. As shown in Figure 1, let $a$ and $b$ be mapped to points $A, B$ of a unit great circle $C_{AB}$ on an $R$-sphere separated by an arc $\theta(a,b)$. By definition the set $\{a, b\}$ is properly mapped. By antipode adjunction the set $\{a, b, a', b'\}$ is properly mapped to $A, B, A', B'$. Let $e_1 = c(a,b)$ and $e_2 = c(a,b')$ be mapped to the midpoints $E_1, E_2$ of the arcs joining $A, B$ and $A, B'$ respectively, and their antipodes to the points opposite. As noted earlier, $e_1$ and $e_2$ are an equatorial pair. By midpoint adjunction the set $\{a, b, e_1, e_2, a', b', e_1', e_2'\}$ is now properly mapped to $C_{AB}$.

Figure 1

The four points $E_1, E_2, E_1', E_2'$ are equally spaced by $\pi/2$. We can apply midpoint adjunction $k$ times to extend the proper mapping to include elements of $V$ whose images are spaced by $\pi/2^k$. With $k$ arbitrarily large the images become dense on $C_{AB}$ in the round metric. The completion of this set of elements in the $d$-metric is the subset $\mathcal{C}_{ab}$ of $V$. From proposition 4.6, if $a \not\perp b \in \mathcal{E}(z)$, then $e_1, e_2$ and all successive elements of $\mathcal{C}_{ab}$ obtained by midpoint adjunction will be in $\mathcal{E}(z)$. □

Note: Here we used Axiom II to ensure the completeness of $V$ in the $d$-metric.

Corollary 5.2. If $x \in \mathcal{C}_{ab}$ there is a unique pair of states $y, y' \in \mathcal{C}_{ab}$ such that $x$ is equatorial to both $y$ and $y'$.

Proof. The required states $\{y, y'\}$ are the points of $\mathcal{C}_{ab}$ with images $\{Y, Y'\}$ at the end points of the diameter of $C_{AB}$ at right angles to the diameter joining the images $\{X, X'\}$ of $\{x, x'\}$. □



We now combine the results of the above lemmas to complete the proof of proposition 5.1.

Let $V$ be of rank $R$ so that there are $n = R + 1$ elements in a maximal set of mutually equatorial elements. Let $e_1, e_2, \cdots, e_n$ of $V$ be any such set. It can be properly mapped to a unit $R$-sphere $S_R$ by placing the images $E_j$ of $e_j$ for $j = 1, \cdots, n$ at the points of $S_R$ with $i$'th cartesian coordinate $\delta_{ij}$. By Lemma 5.1.4 the proper mapping can be extended to include a subset $V^*$ of $V$ with the property that it contains $\mathcal{C}_{xy}$ for every pair $x \not\perp y$ of its elements. We shall prove that $V^* = V$ and that the set of images of $V$ covers $S_R$.

Suppose first that there is some $x \in V$ such that $x \notin V^*$. Then the circular set $\mathcal{C}_{e_1x}$ can contain no point of $V^*$ other than $e_1$, for otherwise $x$ would be in $V^*$. Hence, by the Corollary to Lemma 5.1.4, there is an element $f_1 \notin V^*$ with $f_1 \in \mathcal{C}_{e_1x}$ such that $f_1$ is equatorial to $e_1$. Now the circular set $\mathcal{C}_{e_2f_1}$ may contain no element of $V^*$ other than $e_2$, for otherwise $f_1$ would belong to $V^*$. Since $f_1$ is equatorial to $e_1$ and $e_2$ is equatorial to $e_1$, it follows from Lemma 5.1.4 that every element $f$ of $\mathcal{C}_{e_2f_1}$ is equatorial to $e_1$. In particular this is true of an element $f_2$ of $\mathcal{C}_{e_2f_1}$, which exists by the last part of Lemma 5.1.4, that is equatorial to $e_2$. Thus $e_1, e_2, f_2$ is a mutually equatorial set in which $f_2$ is not an element of $V^*$, since otherwise $f_1$ would be a member. Next we construct $\mathcal{C}_{e_3f_2}$ and repeat this process until we produce a state $f_n$ which belongs to $V$ but does not belong to $V^*$ and has the property that $e_1, e_2, \cdots, e_n, f_n$ is a mutually equatorial set. But this makes the rank of $V$ larger than $R$, which is a contradiction. This proves that $V = V^*$.

Proof that $S_R$ is completely covered by the mapping is obtained by essentially the same argument. Suppose that some point $X$ on the sphere does not appear. Then no point $Y$ other than $E_1$ on the great circle $C_{E_1X}$ can appear. For if $Y$ were the image of some $y \in V$, then the entire circle would appear as the image of the circular set $\mathcal{C}_{e_1y}$. In particular there is a point $F_1$ on $C_{E_1X}$ which is equatorial to $E_1$ and does not occur in the mapping. Similarly no point on $C_{E_2F_1}$ can occur, and every point on it is equatorial to $E_1$. In particular it contains a point $F_2$ which is equatorial to both $E_1$ and $E_2$. Proceeding in this way we construct a point $F_n$ on $S_R$ which is equatorial to $E_1, \cdots, E_n$, which is impossible.

With this result we have established that there is a one-one proper mapping of the completion of $V$ of rank $R$ to the points of an $R$-sphere, which proves proposition 5.1. □



6 Two dimensional subspaces of rank R = 2 

We next make the connection between two dimensional subspaces with rank $R = 2$ and the CRQM for $N = 2$ subspaces.

Proposition 6.1. There is a one-one correspondence between the points $x$ of a two dimensional subspace of $S$ of rank $R = 2$ and the points $\mathbf{x}$ of $CP^1$ such that

$$x_1(x_2) = |\hat{x}_1^* \cdot \hat{x}_2|^2 \quad \text{where } \hat{x} = \mathbf{x}/|\mathbf{x}| \tag{52}$$

which is the Born Rule.

Proof. The space $CP^1$ is the linear space over the complex numbers with elements represented by two homogeneous coordinates, i.e. pairs $\mathbf{x} = \{\alpha_1, \alpha_2\}$, not both of which are zero, such that two pairs are identified if their elements differ by a common non-zero complex factor.

There is a one-one correspondence $X \leftrightarrow \mathbf{x}$ between the points of a unit 2-sphere and the points of $CP^1$ obtained by stereographic projection, i.e. the correspondence $X(\theta, \phi) \leftrightarrow \hat{x} = \mathbf{x}/|\mathbf{x}|$ between a point on a unit 2-sphere with zenith $\theta$ and azimuth $\phi$ and the point of $CP^1$ defined by

$$\hat{x} = (\cos\tfrac{1}{2}\theta,\; e^{i\phi}\sin\tfrac{1}{2}\theta). \tag{53}$$

Let $\hat{x}_1$ and $\hat{x}_2$ correspond to $X_1 = X(\theta_1, \phi_1)$ and $X_2 = X(\theta_2, \phi_2)$, and let $\xi_1$ and $\xi_2$ be the cartesian components of unit vectors from the center to $X_1$ and $X_2$ respectively. Then

$$|\hat{x}_1^* \cdot \hat{x}_2|^2 = \tfrac{1}{2}(1 + \xi_1\cdot\xi_2) = \cos^2(\tfrac{1}{2}X_1X_2) \tag{54}$$

where $X_1X_2$ is the great circle arc between points $X_1$ and $X_2$.

The assertion then follows from (44). □
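The chain of equalities in (54) can be spot-checked numerically. The sketch below is a Python illustration using only the standard library; the helper names are assumptions introduced here. It builds the normalized pair of (53) and the corresponding cartesian unit vector for random angles, then compares the squared overlap with both $\tfrac{1}{2}(1 + \xi_1\cdot\xi_2)$ and $\cos^2(\tfrac{1}{2}X_1X_2)$.

```python
import cmath
import math
import random

def spinor(theta, phi):
    """Normalized CP^1 representative of equation (53)."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def unit_vector(theta, phi):
    """Cartesian unit vector to the sphere point with zenith theta, azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def born(x1, x2):
    """|x1^* . x2|^2 for normalized two-component complex vectors."""
    amp = sum(a.conjugate() * b for a, b in zip(x1, x2))
    return abs(amp) ** 2

def max_deviation(trials=1000, seed=1):
    """Largest disagreement among the three expressions in (54)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        t1, t2 = rng.uniform(0, math.pi), rng.uniform(0, math.pi)
        p1, p2 = rng.uniform(0, 2 * math.pi), rng.uniform(0, 2 * math.pi)
        dot = sum(a * b for a, b in zip(unit_vector(t1, p1), unit_vector(t2, p2)))
        arc = math.acos(max(-1.0, min(1.0, dot)))
        lhs = born(spinor(t1, p1), spinor(t2, p2))
        worst = max(worst, abs(lhs - 0.5 * (1 + dot)),
                    abs(lhs - math.cos(arc / 2) ** 2))
    return worst
```

A result of `max_deviation()` at rounding level is what (54) predicts: the overlap probability depends only on the great circle arc between the two image points.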



Comment 6.1. A similar result is obtained for ranks $R = 1$ and $R = 4$. In the former case one maps the 1-sphere to the real projective line $RP^1$, and in the latter one maps the 4-sphere to the quaternionic projective line $HP^1$.



7 Picking rank R = 2 

To complete our axiomatic system we must choose a property of two dimensional quantum mechanical systems which holds only for rank $R = 2$. One such property pointed out in [15] is that the group of transformations leaving a basis fixed is a continuous, abelian group. However, since our focus in this paper is on information theory, we shall take a different approach based on a property noticed by Sykora [3] and Wootters [4]:

Let $x$ be a state at the north pole of the $R$-sphere. A measurement in the frame $F_y = \{y, y'\}$ transforms $x$ into the mixture $\rho = py + (1 - p)y'$ with $p = \cos^2\tfrac{1}{2}\theta$, where $\theta$ is the co-latitude of $y$. We obtain all possible frames by letting $y$ vary over the upper hemisphere. A simple measure of the purity of $\rho$ is $\delta = |2p - 1| = \cos\theta$, which is twice its distance from the maximally mixed state in the $d$-metric and ranges between 0 for the maximally mixed state and 1 for a pure state. The average of any function of the purity $g(\delta)$ (such as the Shannon entropy) over all frames will be given by

$$\bar{g} = \frac{\int_0^{\pi/2} g(\cos\theta)(\sin\theta)^{R-1}\,d\theta}{\int_0^{\pi/2} (\sin\theta)^{R-1}\,d\theta} = \frac{\int_0^1 g(\delta)(1-\delta^2)^{(R-2)/2}\,d\delta}{\int_0^1 (1-\delta^2)^{(R-2)/2}\,d\delta}. \tag{55}$$

For $R = 2$ the right side reduces to $\int_0^1 g(\delta)\,d\delta$. This is quite remarkable, for what it says is that for $R = 2$ and only for $R = 2$ the purity $\delta$ is uniformly distributed on the interval $[0, 1]$. For $R = 1$ the negative power of $(1 - \delta^2)$ means that the distribution is weighted towards the pure state value $\delta = 1$, whereas for $R > 2$ the positive power of $(1 - \delta^2)$ means that it is weighted towards the maximally mixed state $\delta = 0$.
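The weighting just described can be illustrated by direct sampling. The following Python sketch (standard library only; the function names and the five-bin histogram are choices made here for illustration) takes the north pole along the first coordinate axis, draws frame directions $y$ uniformly on a unit $R$-sphere, and records $\delta = |\cos\theta|$. Taking the absolute value is assumed to be equivalent to restricting $y$ to the upper hemisphere.

```python
import math
import random

def sample_purities(R, n, seed=0):
    """Purity delta = |cos(theta)| for n directions uniform on a unit R-sphere."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(R + 1)]
        norm = math.sqrt(sum(c * c for c in v))
        out.append(abs(v[0]) / norm)
    return out

def histogram(values, bins=5):
    """Fraction of samples falling in each of `bins` equal slices of [0, 1]."""
    counts = [0] * bins
    for d in values:
        counts[min(int(d * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]
```

For $R = 2$ the histogram should be flat to within sampling noise, while for $R = 1$ the bin nearest $\delta = 1$ dominates and for $R = 4$ the bin nearest $\delta = 0$ dominates.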

This suggests the following choice for our last axiom which picks out R = 2: 



Axiom V: The purities of the mixed states resulting from random measurements of a pure qubit state are 
uniformly distributed. 



As noted above, the appropriate measure of the removal of uncertainty when a measurement is made is the 
information entropy which takes account of the lack of pre-knowledge of what measurement is going to be 
made and requires that we average the Shannon entropy over all possible measurements. The effect of the 
weighting described above in the real case ($R = 1$) and quaternion case ($R = 4$) is that the information entropy will be smaller than in the complex case for real quantum mechanics and larger for quaternionic quantum mechanics. We verify this by letting $g(\delta)$ be the Shannon entropy

$$g(\delta) = -\tfrac{1}{2}(1+\delta)\ln\tfrac{1}{2}(1+\delta) - \tfrac{1}{2}(1-\delta)\ln\tfrac{1}{2}(1-\delta) \tag{56}$$

and obtain the values $2\ln 2 - 1 < 1/2 < 7/12$ for $R = 1, 2, 4$ respectively.
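These three values can be reproduced by evaluating (55) numerically with $g$ taken from (56). The Python sketch below is a simple midpoint-rule quadrature in the $\theta$ form of (55), which avoids the endpoint singularity that the $\delta$ form has for $R = 1$. The function names and step count are illustrative choices, not part of the original derivation.

```python
import math

def shannon(delta):
    """Shannon entropy (natural log) of the mixture with purity delta, eq. (56)."""
    total = 0.0
    for p in ((1 + delta) / 2, (1 - delta) / 2):
        if p > 0.0:
            total -= p * math.log(p)
    return total

def average_entropy(R, steps=20000):
    """Midpoint-rule estimate of (55) in its theta form, with g = shannon."""
    h = (math.pi / 2) / steps
    num = 0.0
    den = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        w = math.sin(theta) ** (R - 1)
        num += shannon(math.cos(theta)) * w
        den += w
    return num / den
```

Evaluating `average_entropy(1)`, `average_entropy(2)` and `average_entropy(4)` should agree with $2\ln 2 - 1$, $1/2$ and $7/12$ to within the quadrature error.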

Given this result we might have considered replacing Axiom V by elevating to axiomatic status the assertion 
that the information entropy is 1/2 for a pure qubit state. While this "works" to pick out R = 2, the fact 
that the value is 1/2 depends on the choice of the e-base of logarithms in defining the entropy and has no intrinsic physical significance.† In fact one could just as well have chosen the average of any function of the purity $\delta$ that differs for different values of $R$. Axiom V as stated avoids this.



Comment 7.1. The property of a two dimensional complex Hilbert space expressed by Axiom V is a special case of a property of all finite dimensional complex Hilbert spaces proved by Sykora in the appendix to [3], namely that if $F_yx = \sum_{j=1}^{N} p_jy_j$ denotes a mixed state resulting from a random measurement of a pure state in a complex Hilbert space of dimension $N$, then $p = \{p_1, \cdots, p_N\}$ is uniformly distributed on the "probability simplex" $\sum_{j=1}^{N} p_j = 1$. Sykora further remarks that a different result is obtained for a real Hilbert space. Since we shall deduce the CRQM for $N > 2$ below from the $N = 2$ case, we only needed the fact that Axioms I-IV implied the $R$-sphere structure for $N = 2$, where the points of the probability simplex are linearly related to the purity $\delta$. Thus we could easily deduce the formula (55) and from it see the uniqueness of the complex case in producing a uniform distribution.
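The simplex property quoted in Comment 7.1 can be probed by sampling. The Python sketch below rests on a modelling assumption made here: a random measurement of a fixed pure state is represented by the squared moduli of the components of a uniformly random unit vector in $C^N$ (or $R^N$ for the real case). For a uniform distribution on the probability simplex the second moment of any one component is $2/(N(N+1))$, whereas the real case gives $3/(N(N+2))$. The function names are illustrative.

```python
import random

def outcome_probabilities(N, rng, complex_space=True):
    """Probabilities |c_j|^2 of a uniformly random unit vector in C^N or R^N."""
    if complex_space:
        c = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)]
    else:
        c = [rng.gauss(0, 1) for _ in range(N)]
    norm_sq = sum(abs(v) ** 2 for v in c)
    return [abs(v) ** 2 / norm_sq for v in c]

def second_moment(N, n, complex_space=True, seed=4):
    """Sample mean of p_1^2 over n random measurements."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += outcome_probabilities(N, rng, complex_space)[0] ** 2
    return total / n
```

For $N = 3$ the complex estimate should lie near $1/6$ and the real estimate near $1/5$, consistent with the remark that the real Hilbert space gives a different result.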



8 Extension to N > 2 

It has now been shown that Axioms I-V imply that the qubit subspaces of $S$ are $CP^1$ spaces in which the Born Rule holds. In this section it will be shown that this extends to all of $S$ to produce the CRQM.

To carry this out we must show that the $CP^1$ coordinatization established for qubits enables us to establish a $CP^{N-1}$ coordinatization for $N$ dimensional spaces consistent with the Born Rule. To accomplish this we first show that Axioms I-V imply that $S$ is a projective geometry. While it is intuitively clear that if the two dimensional subspaces of an $N$-dimensional projective space are $CP^1$ spaces the space itself must be a $CP^{N-1}$ space, it is not obvious that a single coordinatization of the space can be carried out in such a way that the Born Rule holds simultaneously in every two dimensional subspace. This will be verified by first showing that there is a coordinatization such that $x(y) = |\hat{x}^* \cdot \hat{y}|^2$ when at least one of the two elements $x, y$ lies on an axis, and then showing that it extends to arbitrary pairs.

† I am indebted to W. Wootters for this observation.

Lemma 8.0.1. $S$ is a projective geometry in which the elements are points, the two-dimensional subspaces are lines, and the three dimensional subspaces are planes.

Proof. We must show that:

(I) Each pair of distinct points is on exactly one line; (II) The Veblen-Young axiom holds (see statement below).

By proposition 4.3 two distinct points $a, b$ determine exactly one two-dimensional subspace $V_{ab}$. We must therefore prove (II), which states that if $a, b, c, d$ are four points no three of which are collinear, and if the lines $V_{ab}$ and $V_{cd}$ intersect, then the lines $V_{ac}$ and $V_{bd}$ intersect. This is an efficient way of asserting that every pair of lines in the plane determined by a pair of intersecting lines will intersect.

We first prove that if $V_{ab}$ and $V_{cd}$ intersect they lie in a three-dimensional subspace:

Figure 2

Refer to Figure 2. Let $V_{ab}$ and $V_{cd}$ intersect at $y$. Let $y'$ be its antipode in $V_{ab}$, and let $y^*$ be its antipode in $V_{cd}$. Let $y''$ be the antipode of $y'$ in $V_{y'y^*}$ and let $y^{*\prime}$ be the antipode of $y^*$ in $V_{y'y^*}$. Since $y \perp y'$ and $y \perp y^*$ it follows from proposition 2.3 that $y$ is orthogonal to both $y''$ and $y^{*\prime}$. Hence both $\{y, y', y''\}$ and $\{y, y^*, y^{*\prime}\}$ are orthogonal sets. Now let $z$ be any point on $V_{cd}$. We have $p(y,z) + p(y^*,z) = 1$ whence $p(y^{*\prime},z) = 0$. But since both $\{y', y''\}$ and $\{y^*, y^{*\prime}\}$ are bases of $V_{y'y^*}$ it follows from proposition 2.4 that $p(y',z) + p(y'',z) = p(y^*,z) + p(y^{*\prime},z)$. Hence $p(y^*,z) = p(y',z) + p(y'',z)$, whence $p(y,z) + p(y',z) + p(y'',z) = 1$. Hence $z$ belongs to the three dimensional subspace spanned by $\{y, y', y''\}$. Since any point on $V_{ab}$ also lies in this space the assertion follows.

Since $V_{ab}$ and $V_{cd}$ lie in a three-dimensional subspace, the lines $V_{ac}$ and $V_{bd}$ lie in that subspace, and we can therefore complete the proof of the Veblen-Young axiom by showing that any two distinct lines in a three dimensional subspace intersect.




Figure 3 



See Figure 3. With no loss in generality we can label the lines $V_{aa'}$ and $V_{xx'}$. Since $N = 3$ there are unique elements $a''$ and $x''$ such that $\{a, a', a''\}$ and $\{x, x', x''\}$ are bases of $S$. The line $V_{x''a''}$ contains an element $x^*$ which is orthogonal to $x''$ and hence lies in $V_{xx'}$. Its antipode $x^{*\prime}$ is orthogonal to both $x^*$ and $x''$ and hence is orthogonal to $V_{x^*x''}$, which is identical to $V_{x''a''}$. Hence it is orthogonal to $a''$, so that $x^{*\prime}$ lies in $V_{aa'}$ as well as in $V_{xx'}$. This establishes the Veblen-Young axiom and completes the proof. □

Definition 8.1. Two lines in a plane $S^*$ of $S$ are said to be normal if the antipode of the intersection in one line is orthogonal to its antipode in the other.



Lemma 8.0.2. Let $z \in V$ and $b \in Q$ where $V$ and $Q$ are normal lines with intersection $x$. Then $b(z)$ has a maximum at $b = x$.

Proof. See Figure 4. Recall equation (46),

$$\cos\theta(a,z) + \cos\theta(b,z) = 2\cos\tfrac{1}{2}\theta(a,b)\,\cos\theta(c,z),$$

which was derived from (31) under the assumption $z \in V_{ab}$, which is spanned by $c, c'$ so that $c(z) + c'(z) = 1$. Here, however, $z$ is only in $V_{ab}$ when $b = x$, but since for arbitrary $z$ we have $c(z) + c'(z) \le 1$ we can replace the equation by the inequality:

$$\cos\theta(a,z) + \cos\theta(z,b) \le 2\cos\tfrac{1}{2}\theta(a,b)\,\cos\theta(c,z) \tag{57}$$

with equality if and only if $z \in V_{ab}$. Since $V$ and $Q$ are normal, $a$ is orthogonal both to $x$ and to the antipode of $x$ on $V$ (not shown in the figure) and hence by proposition 2.3 is orthogonal to every point of $V$. Hence $\theta(a,b) = \pi$ so that the right side of (57) is zero. Let $\theta(a,z) + \theta(z,b) = \pi - \epsilon$ so that (57) becomes

$$\cos\theta(a,z) \le \cos(\theta(a,z) + \epsilon) \tag{58}$$

with equality if and only if $z \in V_{ab}$, which occurs for $b = x$, i.e. $\epsilon = 0$. For small, non-zero $|\epsilon|$ equation (58) implies $\epsilon < 0$, so that $\theta(a,z) + \theta(z,b) > \pi$. Hence $\theta(z,b)$ has a minimum at $b = x$ and hence $b(z) = \cos^2\tfrac{1}{2}\theta(b,z)$ has a maximum. □

Lemma 8.0.3. Let $c$ be the intersection of a pair of normal lines $V$ and $Q$ in a plane $S^*$. If $a \in V$ and $z \in Q$, then

$$a(z) = a(c)c(z). \tag{59}$$

Proof. Figure 5 shows a pair of two dimensional subspaces $V$ and $Q$ with intersection $c$ that are normal to one another. More precisely it shows the images of these subspaces under the BRC mapping to unit 2-spheres. The antipodes of $c$ on $V$ and $Q$ are $c'$ and $c''$ respectively, and $z$ is an arbitrary point of $Q$. Since $c'$ is orthogonal both to $c$ and $c''$ it follows from proposition 4.5 that $c'(z) = 0$. Hence (29) becomes

$$a(z) + b(z) = 2\lambda c(z) \quad \text{where } \lambda = a(c) = \cos^2\tfrac{1}{4}\theta, \tag{60}$$

where $\theta$ is the angle between the images of $a$ and $b$, which lie on a great circle through $c$. By varying the angle $\theta$ we can vary $a$ and $b$ on the circle while holding fixed both the midpoint $c$ of $a$ and $b$ and the midpoint $e$ of $a$ and $b'$. Thus (37) can be written

$$a(z) - b(z) = (2\gamma - 1)K(z), \quad \text{where } \gamma = \tfrac{1}{2}\big(1 + \sqrt{a(b')}\big) = \tfrac{1}{2}\big(1 + \sin\tfrac{1}{2}\theta\big) \tag{61}$$

and

$$K(z) = e(z) - e'(z)$$

is independent of $\theta$. Adding (60) and (61):

$$a(z) = \tfrac{1}{2}K(z)\sin\tfrac{1}{2}\theta + c(z)\cos^2\tfrac{1}{4}\theta. \tag{62}$$

By Lemma 8.0.2 $a(z)$ has a maximum when $a$ is at the intersection point, i.e. at $\theta = 0$. Since the derivative of (62) with respect to $\theta$ at $\theta = 0$ is $\tfrac{1}{4}K(z)$, it follows that $K(z) = 0$ and hence that $a(z) = b(z)$, whence from (60) and (30) we have

$$a(z) = a(c)c(z). \tag{63}$$

□



Definition 8.2. Let $S$ have dimension $N$, and let $\mathbf{a} = \{a_1, \cdots, a_N\}$ be a basis. The lines $q_{ij} = V_{a_ia_j}$ with $i \ne j$ are referred to as the axes of the $\mathbf{a}$-basis. The set $\Omega(\mathbf{a})$ is the set of points belonging to all of the axes of the $\mathbf{a}$-basis.

Lemma 8.0.4. Let $S$ have dimension $N$, and let $\mathbf{a} = \{a_1, \cdots, a_N\}$ be a basis. Let $\mathbf{S}$ be a $CP^{N-1}$ space the elements of which are represented by $N$ component homogeneous complex coordinate vectors $\xi = (\xi_1, \xi_2, \cdots, \xi_N)$. There is a mapping $z \to \mathbf{z}$ of the elements of $S$ to $\mathbf{S}$ such that the Born Rule holds between pairs $z_1, z_2$ provided that at least one member belongs to $\Omega(\mathbf{a})$.

Proof. By induction: The case $N = 2$ is Proposition 6.1. Assume true for $N - 1$. Map the basis elements $a_j$ to $\mathbf{a}_j$ with $j$'th component 1 and the others 0. For $j = 1, \cdots, N$ map the $N - 1$ dimensional subspace $S^{(j)}$ of $S$ which is orthogonal to $a_j$ to the $N - 1$ dimensional subspace $\mathbf{S}^{(j)}$ of $\mathbf{S}$ orthogonal to $\mathbf{a}_j$. By hypothesis of the induction these correspondences can be made in such a way that the Born Rule holds between elements provided at least one of them belongs to one of the axes of $S^{(j)}$. For each $x \in S^{(j)}$ we map the line $V_{a_jx}$ to the two dimensional subspace of $\mathbf{S}$ spanned by $\mathbf{x}$ and $\mathbf{a}_j$, which we can do in such a way that the Born Rule holds between its elements.



Figure 6 (labelled points: $\mathbf{a}_1 = (1,0,0,0)$, $\mathbf{a}_2 = (0,1,0,0)$, $(0,0,1,0)$, $(0,0,0,1)$, $\mathbf{y} = (1,\delta,0,0)$)



To simplify the next part of the proof let us consider the case $N = 4$ (see Figure 6), which will make the general argument clear. The points of $S^{(4)}$ are mapped to $\mathbf{S}^{(4)}$ and so have coordinates of the form $\mathbf{x} = (1, \alpha, \beta, 0)$. (Note that with homogeneous coordinates we can represent the elements in this way. For example $\alpha \to \infty$ corresponds to the element $(0, 1, 0, 0)$.) If $z$ lies on $V_{a_4x}$ it will be mapped to a linear combination of $\mathbf{x}$ and $\mathbf{a}_4$ and hence to a point $\mathbf{z} = (1, \alpha, \beta, \gamma)$ for some $\gamma \in C$. Now let $y$ be an element lying on one of the axes of $S^{(4)}$, say $V_{a_1a_2}$, which is mapped to $\mathbf{y} = (1, \delta, 0, 0)$ for some $\delta \in C$. By these mappings the Born Rule holds for the pair $z, x$ and for the pair $x, y$. Moreover the lines $V_{a_4x}$ and $V_{xy}$ are normal to one another since the antipode of $x$ in $V_{xy}$ lies in $S^{(4)}$ and hence is orthogonal to the antipode of $x$ in $V_{a_4x}$. Hence by lemma 8.0.3, writing $\hat{x} = \mathbf{x}/|\mathbf{x}|$ for the normalized coordinate vectors,

$$z(y) = z(x)x(y) = |\hat{z}^* \cdot \hat{x}|^2\,|\hat{x}^* \cdot \hat{y}|^2 = \frac{|1 + \alpha^*\delta|^2}{(1 + |\alpha|^2 + |\beta|^2 + |\gamma|^2)(1 + |\delta|^2)} = |\hat{z}^* \cdot \hat{y}|^2. \tag{64}$$

In similar fashion one establishes that the Born Rule holds between every $z \in S$ and any element lying on one of the axes of $S$. □
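The factorization in (64) is a purely algebraic property of the coordinate vectors $(1,\alpha,\beta,0)$, $(1,\alpha,\beta,\gamma)$ and $(1,\delta,0,0)$, so it can be checked with random complex entries. The Python sketch below uses only the standard library; the function names are illustrative. It compares the product $z(x)x(y)$ and the closed form in (64) against the direct Born Rule value $z(y)$.

```python
import random

def normalize(v):
    """Scale a homogeneous coordinate vector to unit norm."""
    norm = sum(abs(c) ** 2 for c in v) ** 0.5
    return [c / norm for c in v]

def born(u, v):
    """Born Rule probability |u^* . v|^2 for homogeneous coordinate vectors."""
    u, v = normalize(u), normalize(v)
    return abs(sum(a.conjugate() * b for a, b in zip(u, v))) ** 2

def random_complex(rng):
    return complex(rng.gauss(0, 1), rng.gauss(0, 1))

def max_factorization_error(trials=1000, seed=2):
    """Largest deviation of z(y) from z(x)x(y) and from the closed form in (64)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        alpha, beta, gamma, delta = (random_complex(rng) for _ in range(4))
        x = [1, alpha, beta, 0]
        z = [1, alpha, beta, gamma]
        y = [1, delta, 0, 0]
        closed = abs(1 + alpha.conjugate() * delta) ** 2 / (
            (1 + abs(alpha) ** 2 + abs(beta) ** 2 + abs(gamma) ** 2)
            * (1 + abs(delta) ** 2))
        worst = max(worst,
                    abs(born(z, y) - born(z, x) * born(x, y)),
                    abs(born(z, y) - closed))
    return worst
```

A value of `max_factorization_error()` at rounding level indicates that the coordinates chosen in the proof are consistent with Lemma 8.0.3.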



We are now equipped to prove our main theorem:
Theorem

Axioms I-V imply that there is a mapping of the elements of $S$ of dimension $N$ to a $CP^{N-1}$ space such that the Born Rule holds between every pair of elements.

Proof. Choose a basis $a_1, \cdots, a_N$ of $S$ and perform the mapping described in Lemma 8.0.4. Let $b_1, \cdots, b_N$ be another basis of $S$. We have

$$\sum_{j=1}^{N} b_j(s) = 1 \quad \text{for all } s \in S. \tag{65}$$

If $s$ lies on any of the axes $V_{a_ka_n}$ used for the map above, the Born Rule holds between $s$ and every element and in particular between $s$ and $b_1, \cdots, b_N$. Hence:

$$\sum_{j=1}^{N} |\hat{b}_j^* \cdot \hat{s}|^2 = 1. \tag{66}$$

Apply this result when $s = a_k$ for $k = 1, \cdots, N$ to obtain

$$\sum_{j=1}^{N} |\hat{b}_j^* \cdot \hat{a}_k|^2 = 1 \quad \text{for } k = 1, \cdots, N. \tag{67}$$

Also apply it when $s$ is an arbitrary point on the axis $V_{a_ka_n}$ for all pairs $k, n$ with $k \ne n$, i.e. for $\mathbf{s} = \alpha\hat{a}_k + \beta\hat{a}_n$ where $\alpha$ and $\beta$ are arbitrary. For $k = n$ we have

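$$\sum_{j=1}^{N} |\hat{b}_j^* \cdot \hat{a}_k|^2 = 1, \tag{68}$$

and for $k \ne n$

$$(|\alpha|^2 + |\beta|^2)^{-1}\sum_{j=1}^{N} |\hat{b}_j^* \cdot (\alpha\hat{a}_k + \beta\hat{a}_n)|^2 = 1. \tag{69}$$

The second equation works out to

$$\frac{|\alpha|^2}{|\alpha|^2 + |\beta|^2}\sum_{j=1}^{N} |\hat{b}_j^* \cdot \hat{a}_k|^2 + \frac{|\beta|^2}{|\alpha|^2 + |\beta|^2}\sum_{j=1}^{N} |\hat{b}_j^* \cdot \hat{a}_n|^2 \tag{70}$$

$$\qquad + \frac{2}{|\alpha|^2 + |\beta|^2}\,\mathrm{Re}\Big[\alpha^*\beta\sum_{j=1}^{N} (\hat{a}_k^* \cdot \hat{b}_j)(\hat{b}_j^* \cdot \hat{a}_n)\Big] = 1 \tag{71}$$

By (68) the first two terms already add up to 1, and since $\alpha$ and $\beta$ are arbitrary the sum in the last term must vanish for $k \ne n$. Hence

$$\sum_{j=1}^{N} (\hat{a}_k^* \cdot \hat{b}_j)(\hat{b}_j^* \cdot \hat{a}_n) = \delta_{kn}. \tag{72}$$

Thus the matrix $M_{kj} = \hat{a}_k^* \cdot \hat{b}_j$ is unitary. Hence the coordinates are transformed by a unitary transformation under a change of basis. Given any pair of elements we can choose a basis such that they lie on an axis and hence such that the Born Rule holds between them. But the scalar product is invariant under unitary transformations, whence it follows that the Born Rule remains valid in every basis. □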

We have now completed the proof that Axioms I-V imply the CRQM. In the usual Dirac notation we have established that pure states $x$ in $S$ obeying these axioms correspond to kets $|x\rangle$ such that the Born Rule $x(y) = |\langle x|y\rangle|^2$ holds.

9 Discussion



While our derivation of the CRQM from Axioms I-V has been lengthy, we have obtained the result with no 
dimensional restriction and without having to use Gleason's Theorem which is itself quite lengthy. Moreover 
our inductive proof of the extension to $N > 2$ has obtained the result with no more than the rudiments of projective geometry, in particular avoiding the proof that if the planes of an $N$-dimensional projective space are $CP^2$ the space is $CP^{N-1}$ [14].

A key role in the axiomatic system presented here was played by Axiom IV, motivated by the entropic 
Turing-von Neumann effect in which the loss of purity of a state resulting from a measurement is reduced 
by an intermediate measurement. It is remarkable that the impurity can be made arbitrarily small by a 
sufficiently large number of intermediate measurements. Indeed, one can reproduce unitary (Schrödinger) dynamics with arbitrarily high accuracy in this way. For consider a sequence of measurements performed on a system in frames containing states $|n\rangle$ which are related by $|n\rangle = e^{-iH\tau}|n-1\rangle$ for some Hamiltonian $H$ and time interval $\tau$. The probability $p_n$ for finding the system in state $|n\rangle$ after $n$ measurements starting with $|0\rangle$ is at least $|\langle n|n-1\rangle|^2\,p_{n-1}$. If $\tau\Delta \ll 1$, where $\Delta = (\langle 0|H^2|0\rangle - \langle 0|H|0\rangle^2)^{1/2}$ is the dispersion of $H$, we have

$$|\langle n|n-1\rangle|^2 \approx 1 - (\tau\Delta)^2, \tag{73}$$

whence

$$1 \ge p_n \ge (1 - (\tau\Delta)^2)^n. \tag{74}$$

At a time $T$, when $n = T/\tau$ measurements have been made, the right side approaches $e^{-T\tau\Delta^2}$. Thus given any finite time $T$ and dispersion $\Delta$ one can choose a sufficiently small interval $\tau$ between measurements that the initial state $|0\rangle$ is "coaxed" by the sequence of measurements into a mixed state so dominated by $|n\rangle$ that the entropy increase is arbitrarily small. (The so-called "watchdog" effect or "quantum Zeno effect" occurs when the repeated measurements are made in a frame containing the initial state, which suppresses the evolution altogether.)
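The behaviour of the bound (74) can be illustrated with a concrete two-level example. The Python sketch below is an assumption made here for illustration, not a system discussed in the text: a spin-1/2 with $H = (\omega/2)\sigma_x$ and $|0\rangle$ an eigenstate of $\sigma_z$, for which $\Delta = \omega/2$ and $|\langle n|n-1\rangle|^2 = \cos^2(\omega\tau/2)$ exactly. It compares the resulting lower bound on $p_n$ with the estimate $e^{-T\tau\Delta^2}$.

```python
import math

def follow_bound(omega, T, n):
    """Lower bound (74) on p_n for the assumed H = (omega/2) sigma_x."""
    tau = T / n
    overlap_sq = math.cos(omega * tau / 2) ** 2  # exact |<n|n-1>|^2 here
    return overlap_sq ** n

def zeno_estimate(omega, T, n):
    """exp(-T tau Delta^2) with Delta = omega/2, the dispersion of H in |0>."""
    tau = T / n
    return math.exp(-T * tau * (omega / 2) ** 2)
```

With $\omega$ and $T$ held fixed, increasing $n$ (that is, shrinking $\tau$) drives both quantities towards 1, which is the sense in which the sequence of measurements tracks the unitary evolution.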

Thus, while collapse due to measurement cannot be reproduced by unitary dynamics, a fact that gives rise 
to the measurement problem, we see that the converse is not true, i.e. unitary dynamics can be reproduced 
to arbitrary accuracy by a sequence of collapses. It is thus theoretically possible that what appears to us 
as Schrödinger evolution is a very good approximation to a process in which an interaction Hamiltonian "guides" a sequence of collapse processes happening during very small time intervals. By choosing $\tau$ small enough no increase in entropy would be detected even on a cosmic time scale.



10 Appendix 



For convenient reference we here reproduce the derivation [13, 15] of the relationship between $p$ and the $d$-metric in the conventional model and local hidden variable theories.

In a local hidden variable theory one has a set $\Lambda$ with a measure $\mu$ such that

$$p(x,y) = \mu(\Lambda(x) \cap \Lambda(y)), \qquad \mu(\Lambda(x)) = 1, \ \forall x. \tag{75}$$

To evaluate the $d$-metric we must compute the supremum over $z$ of $|\mu(\Lambda(x) \cap \Lambda(z)) - \mu(\Lambda(y) \cap \Lambda(z))|$. But we note that the contribution coming from any overlap of $\Lambda(x)$ and $\Lambda(y)$ will cancel. Hence one can compute the $z$ maximizing the expression as if the sets are disjoint. This occurs when either $z = x$ or $z = y$ and gives $1 - \mu(\Lambda(x) \cap \Lambda(y))$, whence

$$d(x,y) = 1 - p(x,y) \tag{76}$$

which disagrees with (41).

In the CRQM we have (reverting to Dirac notation):

$$p(x,y) = |\langle x|y\rangle|^2 = \mathrm{Tr}(\pi(x)\pi(y)), \qquad \pi(z) = |z\rangle\langle z|, \tag{77}$$

whence

$$d(x,y) = \sup_z |\mathrm{Tr}(\pi(x)\pi(z)) - \mathrm{Tr}(\pi(y)\pi(z))| = \sup_z |\langle z|\pi(x) - \pi(y)|z\rangle|. \tag{78}$$

But this is just the largest eigenvalue of $\pi(x) - \pi(y)$. Since the $\pi$'s are projectors:

$$(\pi(x) - \pi(y))^3 = (1 - |\langle x|y\rangle|^2)(\pi(x) - \pi(y)) \tag{79}$$

and one reads off the largest eigenvalue to obtain (41).
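The cubic identity (79) can be verified numerically for random pure states. The Python sketch below uses plain nested lists for the matrices and only the standard library; the function names and the default dimension 3 are illustrative choices. Because the identity forces every eigenvalue $\lambda$ of $\pi(x) - \pi(y)$ to satisfy $\lambda^3 = (1 - |\langle x|y\rangle|^2)\lambda$, a vanishing residual confirms that the largest eigenvalue is $\sqrt{1 - |\langle x|y\rangle|^2}$.

```python
import random

def random_state(dim, rng):
    """Random normalized complex vector of the given dimension."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = sum(abs(c) ** 2 for c in v) ** 0.5
    return [c / norm for c in v]

def projector(v):
    """Matrix |v><v| as a nested list."""
    return [[a * b.conjugate() for b in v] for a in v]

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def cubic_residual(dim=3, trials=200, seed=3):
    """Largest entry of D^3 - (1 - |<x|y>|^2) D, with D = pi(x) - pi(y)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x, y = random_state(dim, rng), random_state(dim, rng)
        px, py = projector(x), projector(y)
        D = [[px[i][j] - py[i][j] for j in range(dim)] for i in range(dim)]
        D3 = matmul(matmul(D, D), D)
        overlap_sq = abs(sum(a.conjugate() * b for a, b in zip(x, y))) ** 2
        for i in range(dim):
            for j in range(dim):
                worst = max(worst, abs(D3[i][j] - (1 - overlap_sq) * D[i][j]))
    return worst
```

Running `cubic_residual()` for several dimensions should give values at rounding level, since the identity depends only on the two-dimensional span of $|x\rangle$ and $|y\rangle$.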